# Swin-BART Architecture

Donut Receipts Extract

A specialized receipt text extraction model based on the Donut architecture, achieving OCR-free document understanding through a visual encoder and a text decoder.

UAE License Detection

Donut is an OCR-free document understanding Transformer model that combines a visual encoder and a text decoder to process document images.

Donut Base Finetuned Invoices

Multilingual invoice processing model built on the Donut architecture, capable of extracting key invoice fields.

Donut Base Finetuned Zhtrainticket

Donut model fine-tuned on ZhTrainTicket for document image-to-text conversion without OCR processing.

Donut Base Finetuned Cord V1 2560

Donut is an OCR-free document understanding Transformer model that combines a visual encoder with a text decoder to achieve image-to-text conversion.

Donut Base Finetuned Docvqa

Donut is an OCR-free document understanding Transformer model, fine-tuned on the DocVQA dataset, capable of directly extracting and comprehending text information from images.

Donut Base Finetuned Rvlcdip

Donut is an OCR-free document understanding Transformer model that combines a visual encoder and text decoder to process document images.

Donut is an OCR-free document understanding Transformer model that combines a visual encoder and a text decoder for image-to-text conversion.

Donut is an OCR-free document understanding Transformer model composed of a visual encoder (Swin Transformer) and a text decoder (BART).

Donut Base Finetuned Cord V2

Donut is an OCR-free document understanding Transformer model composed of a visual encoder (Swin Transformer) and a text decoder (BART), capable of directly extracting text information from images.
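The models above return their results as a tagged token sequence rather than plain text, so a post-processing step is needed to recover the extracted fields. The following is a minimal sketch of that step in plain Python, assuming the `<s_name>...</s_name>` field tags and `<sep/>` item separators that CORD-style Donut checkpoints emit. The function names and the sample receipt string are hypothetical and chosen for illustration. In practice, the Hugging Face `DonutProcessor.token2json` method performs this conversion.

```python
import re

# Matches an opening field tag such as <s_menu> and captures the field name.
OPEN_TAG = re.compile(r"<s_([^<>/]+)>")
# Matches opening tags, closing tags, and the <sep/> item separator.
TOKEN = re.compile(r"<(/?)s_([^<>/]+)>|<sep/>")


def split_top_level(text):
    """Split text on <sep/> markers that are not nested inside a field."""
    parts, depth, start = [], 0, 0
    for m in TOKEN.finditer(text):
        if m.group(0) == "<sep/>":
            if depth == 0:
                parts.append(text[start:m.start()])
                start = m.end()
        elif m.group(1):  # closing tag
            depth -= 1
        else:  # opening tag
            depth += 1
    parts.append(text[start:])
    return parts


def parse_tagged_output(sequence):
    """Turn a Donut-style tagged string into nested dicts and lists.

    Simplification (assumption): a field never contains a nested field
    with the same name, and unclosed tags are skipped.
    """
    fields = {}
    pos = 0
    while (m := OPEN_TAG.search(sequence, pos)) is not None:
        key = m.group(1)
        close = f"</s_{key}>"
        end = sequence.find(close, m.end())
        if end == -1:  # unclosed tag: skip it and keep scanning
            pos = m.end()
            continue
        inner = sequence[m.end():end]
        values = [
            parse_tagged_output(part) if OPEN_TAG.search(part) else part.strip()
            for part in split_top_level(inner)
        ]
        # A single value stays scalar; <sep/>-separated values become a list.
        fields[key] = values[0] if len(values) == 1 else values
        pos = end + len(close)
    return fields


# Hypothetical receipt output, not produced by a real model run.
sample = (
    "<s_menu><s_nm>Latte</s_nm><s_price>4.50</s_price><sep/>"
    "<s_nm>Bagel</s_nm><s_price>3.00</s_price></s_menu>"
    "<s_total><s_total_price>7.50</s_total_price></s_total>"
)
parsed = parse_tagged_output(sample)
```

Here `parsed["menu"]` is a list with one dict per line item and `parsed["total"]` is a dict holding the total price. All values remain strings, since the decoder only produces text.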
